Estimating support for protein-protein interaction data with applications to function prediction.
نویسندگان
چکیده
Almost every cellular process requires the interactions of pairs or larger complexes of proteins. High throughput protein-protein interaction (PPI) data have been generated using techniques such as the yeast two-hybrid systems, mass spectrometry method, and many more. Such data provide us with a new perspective to predict protein functions and to generate protein-protein interaction networks, and many recent algorithms have been developed for this purpose. However, PPI data generated using high throughput techniques contain a large number of false positives. In this paper, we have proposed a novel method to evaluate the support for PPI data based on gene ontology information. If the semantic similarity between genes is computed using gene ontology information and using Resnik's formula, then our results show that we can model the PPI data as a mixture model predicated on the assumption that true protein-protein interactions will have higher support than the false positives in the data. Thus semantic similarity between genes serves as a metric of support for PPI data. Taking it one step further, new function prediction approaches are also being proposed with the help of the proposed metric of the support for the PPI data. These new function prediction approaches outperform their conventional counterparts. New evaluation methods are also proposed.
منابع مشابه
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
متن کاملPrediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of the protein localization is among the most important issues in the bioinformatics that is used for the prediction of the proteins in the cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
متن کاملPrediction of Coffee Effects in Rats with Healthy and NAFLD Conditions Based on Protein-Protein Interaction Network Analysis
Background and objectives: Non-alcoholic fatty liver disease (NAFLD) is a common liver condition. On the other hand, coffee consumption has shown promising for gastrointestinal diseases. Detection of the most valuable biomarkers of decaffeinated coffee treatment in healthy and non-alcoholic fatty liver disease conditions was the aim of the present study. Methods:</stro...
متن کاملDiscovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi domain proteins and the interaction between them is often defined by the pairs of their domains. Most of the former studies focus only on interacting domain pairs. However they do not consider the in...
متن کاملP-30: The Effect of The T26248G Polymorphism on Putative MethyltransferaseNsun7 Protein Function and Its Role in Male Infertility
Background: Male infertility has many causes, including genetic infertility. The NOP2/Sun domain family, member7 (Nsun7) gene, which encodes putative methyltransferase Nsun7, has a role in sperm motility. The aim of the present study was to investigate the effect of the T26248G polymorphism on Nsun7 protein function and its role in male infertility. Materials and Methods: Semen samples were col...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational systems bioinformatics. Computational Systems Bioinformatics Conference
دوره 7 شماره
صفحات -
تاریخ انتشار 2008